Multi-class SVMs: From Tighter Data-Dependent Generalization Bounds to Novel Algorithms

Yunwen Lei, Urun Dogan, Alexander Binder, Marius Kloft

Neural Information Processing Systems

This paper studies the generalization performance of multi-class classification algorithms, for which we obtain--for the first time--a data-dependent generalization error bound with a logarithmic dependence on the class size, substantially improving on the state-of-the-art linear dependence in existing data-dependent generalization analyses.
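As a schematic illustration of the improvement (notation assumed here for exposition, not taken from the paper): for a K-class problem with n training samples, a data-dependent bound with linear class-size dependence has the form

    R(h) <= \hat{R}_n(h) + O( K / \sqrt{n} ),

whereas a bound with logarithmic dependence scales as

    R(h) <= \hat{R}_n(h) + O( \sqrt{\log K} / \sqrt{n} ),

where R(h) is the risk and \hat{R}_n(h) the empirical risk; the guarantee then degrades only mildly as the number of classes grows.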


Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm

Deanna Needell, Rachel Ward, Nati Srebro

Neural Information Processing Systems

We improve a recent guarantee of Bach and Moulines on the linear convergence of SGD for smooth and strongly convex objectives, reducing a quadratic dependence on the strong convexity to a linear dependence. Furthermore, we show how reweighting the sampling distribution (i.e. importance sampling) can further improve convergence. Our results are based on a connection we make between SGD and the randomized Kaczmarz algorithm, which allows us to transfer ideas between the separate bodies of literature studying each of the two methods.
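To make the SGD-Kaczmarz connection concrete, here is a minimal NumPy sketch (an illustration under assumed names and setup, not code from the paper): on a consistent least-squares problem, randomized Kaczmarz samples rows with probability proportional to their squared norms, which is exactly SGD with importance sampling and a unit step size.

import numpy as np

def randomized_kaczmarz(A, b, n_iters=5000, seed=0):
    """Randomized Kaczmarz with rows sampled proportionally to their
    squared norms; equivalent to importance-sampled SGD with step size 1
    on the objective 0.5 * ||Ax - b||^2. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_sq = np.einsum('ij,ij->i', A, A)   # ||a_i||^2 for every row
    probs = row_sq / row_sq.sum()          # weighted sampling distribution
    x = np.zeros(n)
    for _ in range(n_iters):
        i = rng.choice(m, p=probs)
        # project x onto the hyperplane {x : a_i^T x = b_i}
        x += (b[i] - A[i] @ x) / row_sq[i] * A[i]
    return x

# Consistent overdetermined system: the iterates converge linearly.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 20))
x_true = rng.standard_normal(20)
b = A @ x_true
x_hat = randomized_kaczmarz(A, b)
print(np.linalg.norm(x_hat - x_true))      # approaches 0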




Reviews: A Sample Complexity Measure with Applications to Learning Optimal Auctions

Neural Information Processing Systems

Summary: Rademacher complexity is a powerful tool for producing generalization guarantees. The Rademacher complexity of a class H on a sample S is roughly the expected maximum gap between training and test performance if S were randomly partitioned into training and test sets, and we took the maximum over all h in H (eqn 11). This paper makes the observation that the core steps in the standard generalization bound proof still go through if, instead of taking the max over all h in H, you only look at the h's that your procedure can possibly output when given half of a double sample (Lemma 1). While unfortunately the usual final (or initial) high-probability step in this argument does not seem to go through directly, the paper shows (Theorem 2) that one can nonetheless obtain a useful generalization bound by other means. The paper then shows how this generalization bound yields good sample complexity guarantees for a number of natural auction classes.
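For reference, the quantity the summary paraphrases is the empirical Rademacher complexity (standard notation, assumed rather than quoted from the paper): for a sample S = (z_1, ..., z_m),

    \hat{\mathfrak{R}}_S(H) = E_\sigma [ \sup_{h \in H} (1/m) \sum_{i=1}^m \sigma_i h(z_i) ],   \sigma_i uniform on {-1, +1}.

Each random sign pattern \sigma effectively splits S into a "training" half and a "test" half, so the supremum captures the worst-case train-test gap over H, matching the informal description above; the paper's refinement restricts this supremum to the hypotheses the learner could actually output on half of a double sample.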


Identifying Nonstationary Causal Structures with High-Order Markov Switching Models

Carles Balsells-Rodas, Yixin Wang, Pedro A. M. Mediano, Yingzhen Li

arXiv.org Machine Learning

Causal discovery in time series is a rapidly evolving field with a wide variety of applications in areas such as climate science and neuroscience. Traditional approaches assume a stationary causal graph, which can be adapted to nonstationary time series with time-dependent effects or heterogeneous noise. In this work we address nonstationarity via regime-dependent causal structures. We first establish identifiability for high-order Markov Switching Models, which provide the foundations for identifiable regime-dependent causal discovery. Our empirical studies demonstrate the scalability of our proposed approach for high-order regime-dependent structure estimation, and we illustrate its applicability on brain activity data.
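As a toy illustration of a regime-dependent causal structure (all numbers and names below are assumptions for illustration, not taken from the paper, and the regime chain is first-order for brevity while the paper treats high-order switching): a hidden Markov chain selects which linear causal graph generates each time step.

import numpy as np

def simulate_switching_var(T=1000, seed=0):
    """Two-regime Markov switching VAR(1): each regime has its own
    coefficient matrix, i.e. its own lagged causal graph."""
    rng = np.random.default_rng(seed)
    P = np.array([[0.95, 0.05],            # regime transition matrix
                  [0.10, 0.90]])
    # Regime 0: x1 -> x2 (lagged); regime 1: x2 -> x1 (lagged).
    B = [np.array([[0.5, 0.0],
                   [0.8, 0.5]]),
         np.array([[0.5, 0.8],
                   [0.0, 0.5]])]
    x = np.zeros((T, 2))
    z = np.zeros(T, dtype=int)
    for t in range(1, T):
        z[t] = rng.choice(2, p=P[z[t - 1]])
        x[t] = B[z[t]] @ x[t - 1] + 0.1 * rng.standard_normal(2)
    return x, z

x, z = simulate_switching_var()
print(f"fraction of time in regime 0: {np.mean(z == 0):.2f}")

Recovering the per-regime matrices and the regime sequence from x alone is the kind of identifiability question the paper studies.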